Warn and exit non-zero on Podman/Infisical secret drift #37
Merged
`_register_secrets` only deletes-then-recreates the names it is given, so any `<workload>--*` Podman secret that falls out of the fetch persists. It still resolves via the shell driver, but `_generate_drop_in` only writes `Secret=` lines for keys in the current fetch, so containers boot without the matching env var. This failed silently when a workload's secrets moved into an Infisical subfolder and `recursive: true` was not set: the drop-in regenerated without those keys, the stale Podman secrets stayed functional, and nobody noticed until a container broke.

Fix the silence:

- Between `_register_secrets` and `_generate_drop_in`, compare the `<workload>--*` namespace against the fetched set and log a WARNING per stale name with a one-line remediation pointer.
- Accumulate drift across workloads; `run_setup` raises `DriftDetectedError` at the end so the setup systemd unit (and `psi cache refresh`) exit non-zero.
- Extend `psi setup --dry-run` to diff each workload's drop-in `Secret=` targets against its `<workload>--*` Podman secrets and report both directions per workload.
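The namespace comparison can be pictured roughly as follows. This is a minimal sketch, not the code from this PR: `find_stale_secrets`, its parameters, the logger name, and the wording of the remediation hint are all hypothetical, and it assumes fetched keys are bare names (`MODE`) that map to Podman secrets named `<workload>--<KEY>`.

```python
import logging

logger = logging.getLogger("psi.drift")  # hypothetical logger name


def find_stale_secrets(workload, podman_names, fetched_keys):
    """Return Podman secrets in the workload's namespace that the fetch no longer covers.

    Assumption: each fetched key K is registered as "<workload>--K".
    """
    prefix = f"{workload}--"
    expected = {f"{prefix}{key}" for key in fetched_keys}
    stale = sorted(
        name
        for name in podman_names
        if name.startswith(prefix) and name not in expected
    )
    for name in stale:
        # Illustrative remediation pointer; the real message text is not shown in the PR.
        logger.warning(
            "stale Podman secret %s is not in the current fetch; "
            "restore the key in Infisical or remove it with 'podman secret rm %s'",
            name,
            name,
        )
    return stale
```

Matching on the full `<workload>--` prefix, double hyphen included, keeps a workload named `windmill` from claiming secrets that belong to `windmill-worker`.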
Summary
`psi setup` now detects when `<workload>--*` Podman secrets exist that aren't in the current Infisical fetch. Each stale name is logged as a WARNING with a one-line remediation pointer, and `run_setup` raises `DriftDetectedError` at the end so the systemd unit exits non-zero.

`psi setup --dry-run` gains a "Workload drift" section that diffs each workload's drop-in `Secret=` targets against its `<workload>--*` Podman secrets, reporting stale Podman secrets and dangling drop-in refs per workload.
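A two-direction diff of this kind could look like the sketch below. It is an illustration under assumptions, not the PR's implementation: `WorkloadDrift`, `secret_refs`, and `diff_workload` are hypothetical names, and it assumes each drop-in line has the Quadlet form `Secret=<name>[,option=...]`, with the secret name as the first comma-separated field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadDrift:
    stale_podman: list[str]   # Podman secrets that no drop-in line references
    dangling_refs: list[str]  # drop-in references with no matching Podman secret


def secret_refs(drop_in_text):
    """Collect the secret names referenced by Secret= lines in a drop-in."""
    refs = set()
    for line in drop_in_text.splitlines():
        line = line.strip()
        if line.startswith("Secret="):
            # Keep only the name; drop options such as type=env,target=...
            refs.add(line[len("Secret="):].split(",", 1)[0])
    return refs


def diff_workload(workload, drop_in_text, podman_names):
    """Diff one workload's drop-in references against its Podman namespace."""
    prefix = f"{workload}--"
    in_podman = {name for name in podman_names if name.startswith(prefix)}
    in_drop_in = secret_refs(drop_in_text)
    return WorkloadDrift(
        stale_podman=sorted(in_podman - in_drop_in),
        dangling_refs=sorted(in_drop_in - in_podman),
    )
```

The two set differences are kept apart because they call for different fixes. A stale Podman secret points at a key missing from the fetch, while a dangling reference points at a drop-in that names a secret Podman does not have.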
Why
`_register_secrets` only deletes-then-recreates the names it's given, so any Podman secret that falls out of the fetch persists. It still resolves via the shell driver, but `_generate_drop_in` omits it, so containers boot without the env var, silently. This bit us when secrets moved into an Infisical subfolder and `recursive: true` was not set on the source: drop-ins regenerated without those keys, stale Podman secrets kept resolving, and the failure surfaced as an unrelated port collision 65 minutes into a reboot.

Recursion stays opt-in; this PR just makes the drift loud instead of changing defaults.
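One way to make drift loud without aborting halfway is to finish every workload first and raise once at the end. The sketch below shows that pattern only. `DriftDetectedError` is the name used in this PR, but its constructor, the `run_setup_sketch` and `main` helpers, and the input shape (a mapping of workload name to stale secret names) are assumptions for illustration.

```python
import sys


class DriftDetectedError(RuntimeError):
    """Raised after all workloads are processed if any had stale secrets."""

    def __init__(self, drift):
        self.drift = drift  # mapping: workload name -> list of stale secret names
        total = sum(len(names) for names in drift.values())
        super().__init__(
            f"{total} stale Podman secret(s) across {len(drift)} workload(s)"
        )


def run_setup_sketch(stale_by_workload):
    """Process every workload, then fail once if drift accumulated."""
    drift = {}
    for workload, stale in stale_by_workload.items():
        # The real setup would register secrets and write the drop-in here;
        # this sketch only records the drift found for the workload.
        if stale:
            drift[workload] = list(stale)
    if drift:
        raise DriftDetectedError(drift)


def main(stale_by_workload):
    """Map the exception to a process exit code, as a CLI entry point might."""
    try:
        run_setup_sketch(stale_by_workload)
    except DriftDetectedError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0
```

Raising only after the loop means one workload's drift does not stop the others from being set up, while the non-zero exit still marks the systemd unit as failed.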
Test plan
- `uv run pytest -q`: 366 passed
- `uv run ruff check psi/ tests/`
- `uv run ruff format --check psi/ tests/`
- `uv run ty check`
- […] a path not covered by any source, run `psi setup`, confirm WARNING in the journal and non-zero exit.
- `psi setup --dry-run` against the homelab config: confirm the new "Workload drift" section lists the known stale `windmill-*--MODE/--NUM_WORKERS/--WORKER_GROUP` entries.